The DevOps Handbook, Second Edition by Gene Kim & Jez Humble & Patrick Debois & John Willis & Nicole Forsgren

Author: Gene Kim & Jez Humble & Patrick Debois & John Willis & Nicole Forsgren
Language: eng
Format: azw3
Publisher: IT Revolution Press
Published: 2021-11-29T21:00:00+00:00


Figure 14.3: One Line of Code to Generate Telemetry using StatsD and Graphite at Etsy

Source: Ian Malpass, “Measure Anything, Measure Everything.”
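To make the idea in the caption concrete, here is a minimal sketch of what "one line of code" telemetry can look like. It is not Etsy's code: the helper names `statsd_line` and `send_metric` and the metric name `login.successes` are hypothetical, and the sketch assumes a StatsD daemon listening on its conventional UDP port 8125 on the local machine. What it relies on is the StatsD plain-text wire format, `name:value|type`.

```python
import socket


def statsd_line(name: str, value: int = 1, metric_type: str = "c") -> str:
    """Format one metric in the StatsD wire format: name:value|type.

    "c" is a counter and "ms" is a timer in StatsD's type codes.
    """
    return f"{name}:{value}|{metric_type}"


def send_metric(name: str, value: int = 1, metric_type: str = "c",
                host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one metric as a single UDP datagram (fire and forget).

    host and port are assumptions; point them at your own StatsD daemon.
    """
    payload = statsd_line(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

With helpers like these in place, instrumenting an event in application code is a single call such as `send_metric("login.successes")`. Because UDP is fire-and-forget, the application does not block waiting for the metrics system, which is part of why adding telemetry this way is cheap.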

When we generate graphs of our telemetry, we will also overlay production changes onto them, because we know that the significant majority of production issues are caused by production changes, which include code deployments. This is part of what allows us to have a high rate of change while still preserving a safe system of work.
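As a rough illustration of overlaying changes onto telemetry, the sketch below merges metric points and deployment events into one timeline and lists the deployments that landed shortly before a spike. Everything in it is an assumption made for illustration: the `Deployment` type, the function names, the 15-minute default window, and the sample data. Real graphing tools provide their own event or annotation features for this.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    timestamp: int  # Unix time in seconds
    label: str      # e.g. a service name or release identifier


def overlay(points, deployments):
    """Merge (timestamp, value) metric points and deployments into one
    time-ordered list of (timestamp, kind, value) rows."""
    rows = [(ts, "metric", value) for ts, value in points]
    rows += [(d.timestamp, "deploy", d.label) for d in deployments]
    # Sort by time; on a tie, "deploy" sorts before "metric".
    return sorted(rows, key=lambda row: (row[0], row[1]))


def deployments_before(spike_ts, deployments, window_seconds=900):
    """Return deployments made within window_seconds before spike_ts,
    most recent first."""
    recent = [d for d in deployments
              if 0 <= spike_ts - d.timestamp <= window_seconds]
    return sorted(recent, key=lambda d: d.timestamp, reverse=True)


if __name__ == "__main__":
    deploys = [Deployment(1000, "web-a"), Deployment(1800, "web-b")]
    points = [(1700, 12), (1900, 95)]  # error counts; 95 is the spike
    for row in overlay(points, deploys):
        print(row)
    print(deployments_before(1900, deploys))
```

Seeing the most recent change next to the spike is the point of the overlay: it narrows the first place to look when a metric moves.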

More recently, the emergence of the OpenTelemetry standard has provided a way for data collectors to communicate with metrics storage and processing systems. There are OpenTelemetry integrations with all major languages, frameworks, and libraries, and most popular metrics and observability tools accept OpenTelemetry data.

By generating production telemetry as part of our daily work, we create an ever-improving capability not only to see problems as they occur but also to design our work so that problems in design and operations are revealed, allowing an increasing number of metrics to be tracked, as we saw in the Etsy case study.

Create Self-Service Access to Telemetry and Information Radiators

In the previous steps, we enabled Development and Operations to create and improve production telemetry as part of their daily work. In this step, our goal is to radiate this information to the rest of the organization, ensuring that anyone who wants information about any of the services we are running can get it without needing production system access or privileged accounts, or having to open up a ticket and wait for days for someone to configure the graph for them.

By making telemetry fast, easy to get, and sufficiently centralized, everyone in the value stream can share a common view of reality. Typically, this means that production metrics will be radiated on web pages generated by a centralized server, such as Graphite or any of the other technologies described in the previous section.
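As one hedged example of self-service access, a Graphite server exposes graphs and data through its HTTP render endpoint, so a link is enough for anyone to see a metric. The sketch below only builds such a link. The host `graphite.example.internal`, the metric name, and the helper `render_url` are hypothetical; `target`, `from`, and `format` are parameters of Graphite's render API.

```python
from urllib.parse import urlencode


def render_url(base_url: str, target: str,
               since: str = "-1h", fmt: str = "json") -> str:
    """Build a Graphite render URL for one metric.

    base_url is an assumed address for the central Graphite server;
    since is a relative time such as "-1h"; fmt is e.g. "json" or "png".
    """
    query = urlencode({"target": target, "from": since, "format": fmt})
    return f"{base_url.rstrip('/')}/render?{query}"


if __name__ == "__main__":
    print(render_url("http://graphite.example.internal",
                     "stats.login.successes"))
```

A URL like this can be bookmarked, pasted into a chat, or embedded in a wall display, and it requires no production system access and no ticket.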

We want our production telemetry to be highly visible, which means putting it in central areas where Development and Operations work, thus allowing everyone who is interested to see how our services are performing. At a minimum, this includes everyone in our value stream, such as Development, Operations, Product Management, and Infosec. This is often referred to as an information radiator, defined by the Agile Alliance as,

the generic term for any of a number of handwritten, drawn, printed, or electronic displays which a team places in a highly visible location, so that all team members as well as passers-by can see the latest information at a glance: count of automated tests, velocity, incident reports, continuous integration status, and so on. This idea originated as part of the Toyota Production System.20

By putting information radiators in highly visible places, we promote responsibility among team members, actively demonstrating the following values:

• The team has nothing to hide from its visitors (customers, stakeholders, etc.).

• The team has nothing to hide from itself: it acknowledges and confronts problems.

Now that we possess the infrastructure to create and radiate production telemetry to the entire organization,
